-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpf: support XDP metadata #6
base: master
Are you sure you want to change the base?
Conversation
An XDP program can use the bpf_xdp_adjust_meta() hook to prepend up to 32 bytes of meta data before the packet. Support this functionality by copying XDP Meta to the local memory after a BPF program execution and pushing in back right before delivering the packet. Signed-off-by: Anton Protopopov <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've taken a quick look at this and it looks like it will work. I would suggest extending the size of the existing metadata region in the packet vector LM rather than defining a new region. The larger region might also mean unrolling the save/restore loops becomes a bit too expensive in terms of code store utilization. Might be worth maintaining the unroll for small amounts of metadata and resorting to looping when there is a large amount of metadata to copy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where does the metadata get copied into the packet and how are you communicating the length to the driver?
Hi @epeer-juniper, thanks for the comments! The length of XDP metadata is always <= 32 bytes and it is a multiple of four (because of this optimization: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/linux/skbuff.h#n3698), thus I've used 1 and three free bytes in a packet vector structure to specify that xdp meta is present and its length, correspondingly (see changes here: https://github.com/Netronome/nic-firmware/pull/6/files#diff-d8ec451c070a9abe40f206ec0f0ce23aL77) The XDP metadata should immediately precede the packet head. When a BPF program executes, it writes meta data there and stores its length in PV as described above. Then xdp_meta_save() saves it to LM, and just before delivering the packet the
To the kernel the length of the XDP Meta is passed as a new "netronome meta" type in "netronome meta", see this line: https://github.com/Netronome/nic-firmware/pull/6/files#diff-510f9a499850424deb9ad7fc7d93beefR171 Also can you please clarify on where and how you suggest to allocate memory and place XDP meta? To allocate room I am extending the packet vector in LM by 32 bytes, so technically it's not a new region, the PV just grows in size. |
#while (LOOP > 1) | ||
|
||
t/**/LOOP#: | ||
mem[read32, $tr, in_pkt_addr_hi, <<8, xdp_meta_base, 1], ctx_swap[sig_rd] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am guessing we do not have enough xfer registers to indirect_ref xdp_meta_base read this in a block?
This must be the case since this macro is called from a block of code that calls pv_invalidate_cache
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is the case. What is the right way to fix this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can to re-use $__pv_pkt_data[32]. This is used to cache packet header data. When you use pv_save_xdp_meta in ebf_reentry, you can do pv_invalidate_cache(in_vec) afterward. On the next pv_seek the packet header will be re-read.
* 11 | Sequence Number | --- | Seq Ctx | Protocol | 3 | ||
* 11 | Sequence Number | XML | Seq Ctx | Protocol | 3 | ||
* +-------------------------------+-+-+-+---------+---+-+-+-+-+-+-+ | ||
* 12 | TX Host Flags |M|B|Seek (64B algn)|-|Q|I|i|C|c| 4 | ||
* 12 | TX Host Flags |M|B|Seek (64B algn)|X|Q|I|i|C|c| 4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes will clash with other work Netronome has incoming. Specifically, we want to grow the Seq Ctx field by one bit, and we have a different use for the X bit. We might have to add another word to the PV, which has a knock-on effect everywhere else. @mrapson-netronome and I are discussing this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I was ready to re-implement the layout when pushing the patch. For me ideal would be to have at least 4+ bits to store the length of XDP meta (in words).
XDP meta is now <= 32 bytes and in 4 bytes chunks, so I need at least 4 bits to store the length. Maybe in future it will be extended to 64 or 128 bytes (still in 4 bytes chunks though), so I would like to have 6+ bits, so that we won't have to change firmware/kernel ABI for this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The situation is that we are running out of local memory on the datapath ME's:
This is the layout without your change:
512 i32.me0._PKT_IO_PKT_VEC
256 i32.me0.gro_cli_lm_ctx
32 i32.me0.__actions_sriov_keys
16 i32.me0.nfd_in_ring_info
16 i32.me0.nfd_out_ring_info
2048 i32.me0.EBPF_STACK_BASE
Total used 2880 of 4096B
With your change:
1024 i32.me0._PKT_IO_PKT_VEC
256 i32.me0.gro_cli_lm_ctx
32 i32.me0.__actions_sriov_keys
16 i32.me0.nfd_in_ring_info
16 i32.me0.nfd_out_ring_info
2048 i32.me0.EBPF_STACK_BASE
Total used 3392B of 4096B
You will notice that changing PV_SIZE_LW from 16 to 24 doubles the _PKT_IO_PKT_VEC size. That is because it must be a power-of-2 size. See the alloc logic in pv.uc line 304. The upshot is that there is 8 more words available in the PV that is not shown in the ASCII art. Perhaps you can move your extra fields to this unused section?
If we add to the above the features that Netronome plans to introduce soon:
1024 i32.me0._PKT_IO_PKT_VEC
640 i32.me0.gro_cli_lm_ctx
16 i32.me0.nfd_in_ring_info
16 i32.me0.nfd_out_ring_info
32 i32.me0._PKT_IO_RX_META
256 i32.me0.__actions_keys <- had to "break" alignment
2048 i32.me0.EBPF_STACK_BASE
Total used 4032B of 4096B
You can see that we are really out of LM, (and I have to violate the alignment requirement of symbol __actions_keys). Thus there really isn't space to grow the XDP metadata.
If we need to go above 32B, we'll have to consider alternatives, like maintaining the metadata in CLS, or directly in the packet buffer.
* 16-23 | XDP Metadata | 0-7 <- ACTIVE_LM_ADDR_2 () | ||
* +---------------------------------------------------------------+ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to lock down LM addr2 for this? Are we sure it's not used elsewhere?
Edit: I think you have to update pv_push and pv_pop as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am allocating data by extending PV_SIZE_LW
to 24 words. Shouldn't pv_push
and pv_pop
work as is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe... It certainly looks like it from inspecting the code. (Note to self, write a unit test to confirm this)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @aspsk . I left some comments. We (Netronome) still need to run tests on your contribution. Unfortunately the current unit tests in the repo does not cover all the code you are touching. Furthermore we need to discuss the layout of the PV.
Hi @JohanM84 thanks for your comments. Let me know how you would like to proceed with the layout and I will update my patches accordingly. |
An XDP program can use the bpf_xdp_adjust_meta() hook to prepend up to 32 bytes
of meta data before the packet. Support this functionality by copying XDP Meta
to the local memory after a BPF program execution and pushing in back right
before delivering the packet.
Signed-off-by: Anton Protopopov [email protected]